Linear Probability Models in Information Systems Research

Authors

  • Galit Shmueli
  • Suneel Chatla
Abstract

Sizes of datasets used in IS research are growing quickly due to data available from digital technologies such as mobile, RFID, sensors, online markets, and more. It is not uncommon to see studies using tens or hundreds of thousands, or even millions, of records. Linear regression is among the most popular statistical models in social sciences research. Linear probability models (LPMs), which are linear regression models applied to a binary outcome, are commonly used in many social science disciplines, despite criticisms of such usage. Surprisingly, LPMs are rare in the IS literature, where logit and probit regression models are typically used for binary outcomes. Whether LPMs provide value or constitute an abuse has been discussed only for specific aspects. A thorough and broad evaluation of their pros and cons for different goals in different scenarios is missing. We carry out an extensive study to evaluate the advantages and dangers of LPMs, especially in the realm of Big Data that now affects IS research, where large samples and many variables are available. We evaluate performance in terms of coefficient estimation as well as predictive power. We compare performance to alternatives suggested in the literature. We find that the LPM is beneficial for explanatory modeling when the outcome is naturally binary, whereas it is beneficial for predictive modeling when the outcome is binary by dichotomization. In large-sample studies IS researchers should consider LPMs for purposes of coefficient estimation if the outcome is naturally binary, but not if it is dichotomized. For predictive purposes, LPMs should be considered even with small samples. We motivate and illustrate our study using a large dataset on online auctions from eBay.
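As a rough illustration of what "linear regression applied to a binary outcome" means in practice, the sketch below fits an LPM by ordinary least squares on simulated data. It is not the authors' code or their eBay data: the data-generating process, the coefficient values, the 0.5 classification cutoff, and the helper names `fit_lpm` and `predict_lpm` are all assumptions made for this example.

```python
import numpy as np

def fit_lpm(X, y):
    """Fit a linear probability model: OLS of a 0/1 outcome on X plus an intercept."""
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

def predict_lpm(beta, X):
    """Return fitted values; these are not constrained to lie in [0, 1]."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return X1 @ beta

# Hypothetical data: a naturally binary outcome drawn from a logistic process.
rng = np.random.default_rng(0)
n = 10_000
x = rng.normal(size=n)
p_true = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * x)))
y = (rng.uniform(size=n) < p_true).astype(float)

beta = fit_lpm(x, y)          # [intercept, slope]
p_hat = predict_lpm(beta, x)  # fitted "probabilities"

# Predictive use: classify with an assumed 0.5 cutoff.
labels = (p_hat >= 0.5).astype(int)
accuracy = np.mean(labels == y)

# A common criticism of LPMs: some fitted values fall outside [0, 1].
share_outside = np.mean((p_hat < 0) | (p_hat > 1))

print("coefficients:", beta)
print("accuracy:", accuracy)
print("share of fitted values outside [0, 1]:", share_outside)
```

The slope here is read directly as the change in the probability of the outcome per unit change in `x`, which is the interpretability argument usually made for LPMs; the out-of-range fitted values are the corresponding drawback that logit and probit models avoid.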


Similar resources

FUZZY INFORMATION AND STOCHASTICS

In applications there occur different forms of uncertainty. The two most important types are randomness (stochastic variability) and imprecision (fuzziness). In modelling, the dominating concept to describe uncertainty is using stochastic models which are based on probability. However, fuzziness is not stochastic in nature and therefore it is not considered in probabilistic models. Since many years t...


New Realities of the Enterprise Management System Information Support: Economic and Mathematical Models and Cloud Technologies

The paper focuses on the urgency of the implementation of cloud technologies, which are a necessary condition for the development of enterprise management systems, give rise to a complex of insufficiently studied phenomena and processes and determine the need to find new tools in making and implementing reasonable management decisions. In the process of research, the sequence of construction an...


Potentials of Evolving Linear Models in Tracking Control Design for Nonlinear Variable Structure Systems

Evolving models have found applications in many real world systems. In this paper, potentials of the Evolving Linear Models (ELMs) in tracking control design for nonlinear variable structure systems are introduced. At first, an ELM is introduced as a dynamic single input, single output (SISO) linear model whose parameters as well as dynamic orders of input and output signals can change through ...


A new multi-step ABS model to solve full row rank linear systems

ABS methods are direct iterative methods for solving linear systems of equations, where the i-th iteration satisfies the first i equations. Thus, a system of m equations is solved in at most m ABS iterates. In 2004 and 2007, two-step ABS methods were introduced in at most [(m+1)/2] steps to solve full row rank linear systems of equations. These methods consuming less space, are more compress ...


Extension of Cube Attack with Probabilistic Equations and its Application on Cryptanalysis of KATAN Cipher

Cube Attack is a successful case of Algebraic Attack. Cube Attack consists of two phases, linear equation extraction and solving the extracted equation system. Due to the high complexity of equation extraction phase in finding linear equations, we can extract nonlinear ones that could be approximated to linear equations with high probability. The probabilistic equations could be considered as l...


Minimum Information Updating with Specified Marginals in Probabilistic Expert Systems

A probability-updating method in probabilistic expert systems is considered in this paper based on the minimum discrimination information. Here, newly acquired information is taken as the latest true marginal probabilities, not as newly observed data with the same weight as previous data. Posterior probabilities are obtained by updating prior probabilities subject to the latest true marginals. ...



Publication date: 2014